-
-
- CURRENT_MEETING_REPORT_
-
-
- Reported by Jeffrey Mogul/DEC
-
- AGENDA
-
- (a) Report on current draft (McCloghrie/Fox/Mogul)
- (b) Review other alternatives
- (c) Review goals and assumptions
- (d) Obtain consensus on approach
- (e) Focus on details
- (f) What next?
-
- MINUTES
-
- This was the second meeting of the MTU Discovery Working Group.
-
- We started with a quick presentation by Keith McCloghrie of the draft
- that he and Rich Fox wrote based on the apparent consensus of the
- December meeting. Some attendees had not read the draft, and we tried
- to ensure that everyone understood the basic outline. [Summary:
- senders occasionally attach an IP PMTU-Query Option to their datagrams.
- Routers update the PMTU value in the option; the last-hop router returns
- the PMTU to the sender using the ICMP Path-MTU message. If the
- destination host detects a change in the MTU (when a fragment is
- received), it sends an ICMP Unexpected Fragment Report message.]
-
- We also reviewed the "Steve Deering" proposal from last year, as there
- was a realization that it might not be dead, after all. Among other
- things, we now know that there are not 1 but 4 spare bits in the IP
- header (there are 3 unused in the TOS field), and that the powers that
- be might therefore be likely to let us use one. [Summary of Deering
- proposal: senders often send datagrams with "RF" (Report Fragmentation)
- bit set in the IP header. A host receiving fragment-0 of a datagram
- with RF set sends an ICMP Fragmentation Occurred message.]
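-
- The receiver-side rule in the Deering summary can be sketched as below.
- The `Datagram` type and its `rf` field are hypothetical (no header bit
- had been assigned at the time of the meeting); treating "fragment-0" as
- "offset zero with More Fragments set" is an interpretation, not a quote
- from the proposal.

```python
from dataclasses import dataclass


@dataclass
class Datagram:
    more_fragments: bool   # the IP "MF" flag
    fragment_offset: int   # in 8-byte units, as in the IP header
    rf: bool               # proposed "Report Fragmentation" bit (hypothetical)


def should_report_fragmentation(d):
    """Return True when the destination host would send an ICMP
    Fragmentation Occurred message for this arriving datagram."""
    # Fragment-0 is the piece at offset zero that still has more
    # fragments following it; an unfragmented datagram has MF clear.
    is_first_fragment = d.fragment_offset == 0 and d.more_fragments
    return d.rf and is_first_fragment


print(should_report_fragmentation(Datagram(True, 0, True)))
```

- Only the first fragment triggers a report, so a fragmented datagram
- produces at most one ICMP message in this sketch.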
-
- We then started a fairly unstructured discussion comparing the costs and
- benefits of the two approaches.
-
- 1. Lifetime of protocol: on the one hand, in principle MTU discovery
- should be obviated by the coming revolution in routing protocols.
- Within "a few" years, the routing protocols will provide path-MTU
- information, so MTU discovery will be unnecessary. Of course, we
- all know about things that are supposed to happen "real soon now";
- we particularly all know about relatively new things that
- "everyone" implements. Still, while avoiding the trap of assuming
- that the world will be perfect in just a couple of years, it may
- not be worth trying to solve the problem of MTU discovery for all
- time, since it may not be useful for that long.
- 2. Rapidity of deployment: Clearly, MTU discovery of any form only
- works for a sender if some subset of the other nodes (routers
- and/or destinations) support it. Query-based schemes depend upon
- support from a large fraction of the routers; RF-style schemes only
- help if a large fraction of the end-hosts support it. There was
- some debate about which population is more likely to upgrade soon
- (routers or end-hosts). No consensus was reached.
- 3. Connection lifetimes: Van Jacobson's data suggest that most non-local TCP
- connections are short (ca. 4 datagrams). This makes some sense
- (mostly SMTP) although this is only one sample point, and we agreed
- that more data would be useful. Van argued that this works against
- a query-based scheme, since by the time one has useful information,
- there's not much left to do with it. His argument in favor of the
- RF scheme was that the right way to use it is to assume that you
- can send large datagrams (sized by your first-hop MTU, or perhaps
- some estimate of the NSFNET PMTU, ca. 1500), and let the
- destination tell you if you are screwing up.
- In general, we realize that fragmentation is not inherently evil.
- Although it might create some extra overhead for the routers, what
- we really have to avoid is the "deterministic fragment loss"
- problem which causes connections to stall. Thus (I hope I am
- correctly paraphrasing Van's argument), MTU discovery is only worth
- doing for connections that last a while, either because they are
- carrying lots of data, or because they are stalled due to fragment loss.
- Query-based schemes waste router resources because processing IP
- options is expensive, and the payoff is unlikely.
- It was argued that, since the senders cache the MTU values learned
- by either scheme in the per-host routing entries, querying would
- not have to be done on every connection to be useful. Again, Van
- drew on his traffic studies to suggest that (even over a 12-hour
- period) there was generally little correlation between connections
- ... that is, just because one pair of hosts makes a connection
- does not mean that they will do so any time soon. Some of us did
- not believe that is necessarily true (for example, how much traffic
- comes from mail-hub machines like DECWRL and UUNET?) Again, we
- agreed that it would be nice to have more traffic data available.
- 4. Complexity: Now that the draft specification for the query-based
- scheme is done, we realized that it is a lot more complex than we
- thought. One problem is the number of tunable parameters. Since
- the RF scheme doesn't require the receiver to maintain any state
- about the sender [actually, this is not quite true, as noted
- later], doesn't require the sender to schedule when to send the
- option, doesn't cause the receiver to send notifications when
- intentional fragmentation occurs [NFS would probably not set RF],
- and it requires no support at all from the routers, it appears to
- be simpler [but keep reading].
-
- After this discussion, it was pretty clear that the consensus had
- shifted to trying to use the RF scheme. We made the assumption that we
- could get a header bit (Van argued that although the RF scheme could be
- done using an option, the cost/benefit analysis might be against it).
- The next step was to explore how well that would really work.
-
- One problem that came up right away is that James VanBokkelen believes
- there to exist many PC-based systems that (1) do not reassemble
- fragments and (2) do advertise MSS values of 1500 to non-local peers.
- Currently, these hosts function because the 576-if-nonlocal rule
- observed by most non-PC hosts means that, given today's Internet, even
- when they advertise an MTU of 1500 to a non-local host, the host at the
- other end will not send datagrams big enough to be fragmented. [I
- suppose it is unlikely for two PCs to talk to each other over long
- distances.] However, if we use the simplest RF scheme, these hosts are
- going to get fragmented datagrams. Since we assume that any host which
- implements MTU discovery is also in conformance with the other rules
- (specifically, fragmentation reassembly), we therefore know that such
- sub-standard PCs won't send the ICMP Fragmentation Occurred message, and
- these connections would stall.
-
- The obvious fix is to not invoke MTU discovery (i.e., not send segments
- > 576 bytes) unless you are sure that the other end supports it. This
- means that you have to have seen a datagram with RF set coming back to
- you from the destination before you can send large datagrams.
-
- More subtly, since we don't want to mislead these stupid PCs (which
- apparently don't follow the 576-byte rule in either direction) you
- cannot even send an MSS > 576 to a non-local peer until you have seen an
- RF bit from it. Thus, since the TCP MSS option can only be sent on the
- SYN datagram, a host initiating a TCP connection may not be able to use
- MTU discovery (and large segments) unless it has talked with the other
- end recently. (The second host is in a better position; since it sees
- the RF bit before it has to send its own MSS option, it can set a large
- MSS immediately. This is nice for FTP retrieves; it doesn't help for
- SMTP, alas).
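-
- The handshake rule described in the last two paragraphs can be sketched
- as a small per-peer cache. Everything here is hypothetical: the class
- and method names are invented for illustration, the cache never expires
- entries (the minutes only say "recently"), and 576 is used as the
- conservative size in the same loose way the minutes use it.

```python
CONSERVATIVE_SIZE = 576  # the "576-if-nonlocal" rule mentioned in the minutes


class PeerCache:
    """Remembers which peers have sent us a datagram with RF set."""

    def __init__(self):
        self._rf_seen = set()

    def note_datagram(self, peer, rf_set):
        # Called for every datagram received from `peer`.
        if rf_set:
            self._rf_seen.add(peer)

    def size_to_offer(self, peer, is_local, first_hop_mtu):
        """Largest size we are willing to send or advertise to `peer`."""
        if is_local or peer in self._rf_seen:
            return first_hop_mtu
        # Peer has not shown that it implements MTU discovery, so stay
        # conservative to avoid stalling a host that cannot reassemble.
        return min(CONSERVATIVE_SIZE, first_hop_mtu)


cache = PeerCache()
# Initiator's first connection: nothing is known about the peer yet.
first = cache.size_to_offer("peer.example", is_local=False, first_hop_mtu=1500)
# A datagram with RF set comes back from the peer.
cache.note_datagram("peer.example", rf_set=True)
# A second connection made soon afterwards can use large segments.
second = cache.size_to_offer("peer.example", is_local=False, first_hop_mtu=1500)
print(first, second)
```

- In this sketch the first connection is held to 576 and the second may
- use 1500, which matches the FTP control-then-data pattern.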
-
- The consensus was that this limitation was acceptable, since it erred on
- the conservative side. (Although it penalizes the most common
- connection type [SMTP], SMTP connections are normally short, so we
- wouldn't gain much anyway.) When two connections are made in quick
- succession, things work nicely (e.g., several mail messages, or the
- control connection of an FTP session followed by the data connection.
- The control connection will seldom carry large segments, but the
- exchange of RF bits done then will allow the data connection to use
- large segments right away.)
-
- Mike Karels proposed (off-the-cuff, not necessarily believing that it
- was right) that routers fragmenting a datagram with RF set could also
- send the fragmentation-occurred ICMP. This seemed to create problems
- given the requirement for handshaking imposed by the broken-PC crowd, so
- Mike agreed to go off and think about this one.
-
- One question arose about the use of a previously unused bit in the IP
- header: what would current implementations do if they see it set? (We
- know that we can safely add options, since by definition these are
- ignored if not known.) While the IP spec says these bits must be zero,
- the "robustness principle" implies that routers and hosts should ignore
- them. Unfortunately, John Moy from Proteon admitted that Proteon
- routers drop such datagrams, and Noel Chiappa says that this is true of
- other implementations based on his old MIT "C-gateway" code. We have to
- find out just how bad this is going to be; perhaps Proteon will be able
- to upgrade all of its customers before MTU discovery is widely
- implemented.
-
- [Side note: Clearly, implementations contrary to the basic IP spec are
- causing us serious grief. How much do we twist the protocol to
- accommodate them?]
-
- An orthogonal issue is that in high-speed long-distance networks, there
- might be lots of packets in flight when the route changes to one with a
- lower MTU (e.g., on a satellite link with a half-second RTT, 4-kbyte
- packets, and a 100 Mbit/sec channel, this means about 1500 packets per
- RTT!). Since the source cannot react to a Fragmentation Occurred message
- sooner than one RTT worth of packets after the one that triggered the
- message, we are concerned that setting the RF bit on every packet could
- lead to positive (i.e., anti-stability) feedback in a network that is
- losing capacity.
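-
- The packets-per-RTT figure in the satellite example can be checked with
- a short calculation. This assumes the packet size is 4 kilobytes (taken
- here as 4096 bytes) and ignores header overhead; the function name is
- only illustrative.

```python
def packets_in_flight(bandwidth_bits_per_sec, rtt_sec, packet_bytes):
    """Number of packets that fit in one bandwidth-delay product."""
    bits_in_flight = bandwidth_bits_per_sec * rtt_sec
    return bits_in_flight / (packet_bytes * 8)


# 100 Mbit/sec channel, half-second RTT, 4096-byte packets.
n = packets_in_flight(100e6, 0.5, 4096)
print(round(n))  # about 1500, consistent with the figure in the minutes
```

- Each of those packets could trigger its own ICMP message if RF were set
- on all of them, which is the feedback concern raised above.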
-
- This could be attacked in two ways: limit the rate at which the RF bit
- is sent, or limit the rate at which the ICMP is sent. The former could
- be done "once per RTT", once per some constant time period, or perhaps
- once per window. It's not clear if there is a convenient way of marking
- out the boundaries between windows.
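-
- The first option (limiting how often the RF bit is sent) might look
- something like the following sketch. The `RFPacer` class is hypothetical
- and uses a fixed interval; an implementation pacing "once per RTT" would
- pass in its current RTT estimate instead.

```python
class RFPacer:
    """Allows the RF bit on at most one datagram per interval."""

    def __init__(self, min_interval_sec):
        self.min_interval = min_interval_sec
        self._last_rf_time = None  # time the RF bit was last set

    def should_set_rf(self, now):
        """Return True if the datagram sent at time `now` may carry RF."""
        if self._last_rf_time is None or now - self._last_rf_time >= self.min_interval:
            self._last_rf_time = now
            return True
        return False


pacer = RFPacer(min_interval_sec=0.5)
send_times = [0.0, 0.1, 0.5, 0.6, 1.0]
decisions = [pacer.should_set_rf(t) for t in send_times]
print(decisions)
```

- Here only the datagrams sent at 0.0, 0.5, and 1.0 seconds would carry
- the bit, so at most one report per interval could come back.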
-
- ACTION ITEMS
-
-
- 1. Noel Chiappa and Van Jacobson were assigned to try to get the IESG
- to free up an IP header bit.
- 2. Mike Karels was going to think more about having routers send ICMPs
- when they fragment.
- 3. We need to determine how many routers will drop packets with RF
- set, and how hard it will be to fix this. Is it any different if
- we use one of the bits in the TOS area?
- 4. Ditto for end-hosts; are there any that drop such packets?
- 5. The Router Requirements WG was known to be considering changing the
- way that fragmentation was done (fragment into equal-size pieces;
- currently, routers are supposed to send N maximal-size fragments
- and one smaller one). This would make the RF scheme nearly
- useless. [Phil Almquist says that the RRWG will work with us on
- this, so it shouldn't be a problem].
- 6. Perhaps more traffic studies would be useful.
- 7. Someone has to write the next draft. Keith and Rich were thanked
- for their hard work, on their draft that is now tabled, and were
- not coerced into starting a different document. Since Van was the
- fiercest proponent of RF at the meeting, he was given
- responsibility to see to it that the draft is written. He agreed
- but said he was going to try to get Steve Deering to do the work
- (Steve was absent due to serious thesis time-pressure, so maybe Van
- is going to be stuck with it.) The chair requested a draft within
- one month (7 March 1990).
- 8. James VanBokkelen was going to see just how many hosts out there
- are unable to reassemble fragmented IP datagrams, how hard it would be to
- fix this, how many vendors are involved, etc.
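-
- To illustrate action item 5, the sketch below compares the two
- fragmentation styles for one datagram. The minutes do not say why
- equal-size fragments would hurt the RF scheme; the reading assumed here
- is that the destination can estimate the path MTU from the size of the
- first fragment only when that fragment is maximal. Function names are
- hypothetical, and a 20-byte IP header without options is assumed.

```python
IP_HEADER = 20  # bytes, assuming no IP options


def _split(payload, per_frag):
    """Cut `payload` bytes into pieces of `per_frag`, last one smaller."""
    sizes = []
    while payload > per_frag:
        sizes.append(per_frag + IP_HEADER)
        payload -= per_frag
    sizes.append(payload + IP_HEADER)
    return sizes


def maximal_fragments(total_length, mtu):
    """N maximal-size fragments followed by one smaller fragment."""
    if total_length <= mtu:
        return [total_length]
    # Fragment payloads (except the last) must be multiples of 8 bytes.
    per_frag = (mtu - IP_HEADER) // 8 * 8
    return _split(total_length - IP_HEADER, per_frag)


def equal_fragments(total_length, mtu):
    """Same fragment count, but pieces of roughly equal size."""
    if total_length <= mtu:
        return [total_length]
    payload = total_length - IP_HEADER
    max_per_frag = (mtu - IP_HEADER) // 8 * 8
    count = -(-payload // max_per_frag)        # integer ceiling
    per_frag = -(-payload // (count * 8)) * 8  # round up to 8-byte units
    return _split(payload, per_frag)


print(maximal_fragments(1500, 576))
print(equal_fragments(1500, 576))
```

- For a 1500-byte datagram crossing a 576-byte link, the first maximal
- fragment is 572 bytes (close to the link MTU), while the first
- equal-size fragment is 516 bytes, which would understate the usable MTU.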
-
-
- IESG ACTION
-
- On Thursday, February 8, at the open IESG meeting, the IESG was asked to
- allow this bit to be used for MTU discovery. I was not there, but I
- understand that the IESG is willing to release this bit if we come to a
- consensus on a protocol that they think is reasonable.
-
- SCHEDULE
-
- We expect to meet again at the May IETF meeting.
-
- At that point, we will probably either adopt one of the schemes, or give
- up.
-
- ATTENDEES
-
- Ballard Bare          bare%hprnd@hplabs.hp.com
- Art Berggreen         art@sage.acc.com
- Richard Bosch         probe@mit.edu
- Ron Broersma          ron@nosc.mil
- John Cavanaugh        John.Cavanaugh@StPaul.ncr.com
- Noel Chiappa          jnc@LCS.MIT.EDU
- James Davin           jrd@ptt.lcs.mit.edu
- Farokh Deboo          sun!iruucp!ntrlink!fjd
- Rich Fox              sytek!rfox@sun.com
- Van Jacobson          van@lbl-csam.arpa
- Mike Karels           karels@berkeley.edu
- Mike Marcinkevicz     mdm@gumby.dsd.trw.com
- Tony Mason            mason@transarc.com
- Keith McCloghrie      sytek!kzm@hplabs.HP.COM
- Bill Melohn           melohn@sun.com
- Jeff Mogul            mogul@decwrl.dec.com
- John Moy              jmoy@proteon.com
- Drew Perkins          ddp@andrew.cmu.edu
- Michael Petry         petry@trantor.umd.edu
- Nuggehalli Pradeep    pradeep@orville.nas.nasa.gov
- Mark Rosenstein       mar@athena.mit.edu
- Tony Staw             staw@marvin.enet.dec.com
- James VanBokkelen     jbvb@ftp.com
- John Veizades         veizades@apple.com
- Steve Willis          swillis@wellfleet.com
- John Wobus            JMWobus@suvm.acs.syr.edu
- David Zimmerman       dpz@convex.com